OBHave Application / Content Item Set Container
Content Item Set Container is the high-level UX component that controls the Content Item Set component. In a sense it is very similar to the Netflix view that lists all the movie genres. The Container sends the user id to the OBHave platform and receives a list of Content Item Sets and their Content Items in prioritized order. The Content Item Set Container (from now on CISC) listens for the down-scroll event, lazy loads more Content Item Sets (CIS) if needed, and submits the related user behavior information back to OBHave. A CIS resembles a traditional carousel component.

In a Flux architecture the CISC holds the data store, which the CISs use to arrange their Content Items (CI) in prioritized order. When the user triggers a navigation event (a Flux Action), e.g. a horizontal scroll of a CIS as in Netflix or a CI preview, the DataStore rearranges the order of the CISs and their CIs for content that the user hasn't seen yet (not necessarily lazy loaded, but lazy rendered). The lazy rendering makes the user interface more reactive to the user's current mood.

If you load 15 CISs with 25 CIs each, at most 375 CIs are loaded. If the front page can show three CISs with five CIs each (15 CIs in total), we have a pool of 360 CIs that we can rearrange by user behavior before the first navigation event (down or horizontal scroll). Netflix and many other recommendation systems are far less responsive, because they don't lazy load enough; this might have been a scalability problem, but as you can see, the performance penalty can be moved to the user's browser. Of course lazy loading might induce performance problems in some cases. We will optimize this by pre-loading a certain navigational distance and lazy loading the rest (e.g. one step in the vertical dimension and one step in the horizontal dimension, a total of 30 CIs plus the 15 CIs on the front page, which leaves a remaining pool of 330 CIs).
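The pool sizes quoted above (375, 360 and 330 CIs) can be reproduced with a few lines of arithmetic. The following is only a sketch: the class name Layout, its field names and the preloaded_screens parameter are hypothetical, and it assumes that every pre-loaded navigation step fixes one more screen of 15 CIs, which is how the 30 pre-loaded CIs in the example appear to be counted.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Layout:
    """Hypothetical description of one CISC load (names are illustrative)."""
    sets_loaded: int = 15     # CISs returned by the initial load
    items_per_set: int = 25   # CIs inside each CIS
    visible_sets: int = 3     # CISs visible on the front page
    visible_items: int = 5    # CIs visible per CIS on the front page

    @property
    def total_items(self) -> int:
        # Maximum number of CIs loaded in total.
        return self.sets_loaded * self.items_per_set

    @property
    def front_page_items(self) -> int:
        # CIs the user sees before any navigation event.
        return self.visible_sets * self.visible_items

    def rearrangeable_pool(self, preloaded_screens: int = 0) -> int:
        # Assumption: each pre-loaded navigation step pins one more
        # screen of CIs, so those CIs leave the rearrangeable pool.
        pinned = self.front_page_items * (1 + preloaded_screens)
        return self.total_items - pinned


layout = Layout()
print(layout.total_items)                              # 375
print(layout.rearrangeable_pool())                     # 360
print(layout.rearrangeable_pool(preloaded_screens=2))  # 330
```

Here preloaded_screens=2 stands for one pre-loaded step down plus one pre-loaded step sideways, matching the example in the text.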
Further Development Ideas for Mood Strategies and Performance Optimization

Since the problem is that we don't want lazy loaded content too close to the user's view, we could still micro-rearrange the non-lazy loaded CIs separately, because React performs quite well for such tasks. With a navigational distance of two in the previous setup, we would have a pool of 75 pre-loaded CIs (horizontal-scroll horizontal-scroll, down-scroll horizontal-scroll, down-scroll down-scroll), which we can rearrange when the user previews an item or when the first navigation event is done.

We could also extend the mood-based recommendation by storing the CIs in local storage with an expiration time, so frequent users would get better recommendations and put less load on the server. Even better, the initial load, which contains the list of CI ids, could carry expiration timestamps to enable purging and to extend the default expiration times, which would further improve the performance of both the server and the recommendation.

Mood strategies will learn from failures (the navigational distance to the consumed CI is longer than the non-lazy loaded distance), but there is no need to send additional data to OBHave, because this can be deduced from the navigation data linked to the consumed CI, which OBHave tracks by default. Mood strategies are a user-group-specific feature, because it would be computationally infeasible to make them learn in an atomic user environment.

The "Math" of Mood Strategy

The mood strategy (or set of mood strategies, because competing algorithms are less prone to becoming biased and self-reinforcing) would be ranked by its ability to match the user's behavior when it is used (navigational distance of lazy loading).
OBHave will store a last_run timestamp for the learning process of the mood strategies. Based on it, a strategy is evaluated on how well it recommended content items to the user after the last_run, by calculating the probability of it recommending the content items that the user did consume after the last_run. This is done by selecting the CISs that failed to provide the user-consumed content before "lazy load land" or the first view (depending on configuration etc.). The rules (which I call Tags) are adjusted in an effort to find the maximum probability that the user could have seen the consumed content items sooner and with less navigation, based on the navigation events the user performed before finding the consumed content item (e.g. previews, horizontal scrolling, downward scrolling).

Rules try to implement "message information" as used in the Entropy of Information Theory. An example rule: "user previewed a CI with a similar average user group rating to the consumed CI". Rules gain popularity among mood strategies by their frequency in successful recommendations (users tend to choose similarly rated content after previewing another content item), and the popularity counts as the probability of the rule message existing in the user behavior data (Entropy).

A mood strategy tries different rules with Reinforcement Learning, where the Utility of each rule is the probability value based on the Entropy TIMES the ultimate goal of successfully recommending the maximal amount of consumed content early. The ultimate goal is modeled as the sum of consumed CIs, weighted by the number of users that consumed the CI within a user group AND chosen from the CIs consumed after the last_run.

Example: Two users belong to a user group and have consumed two CIs after last_run that failed to be shown early. User 1 has consumed one of the items and user 2 has consumed both.
If a new rule matches the CI consumed by both users, the Reward of the mood strategy is 2; if it matches the other CI, the Reward is 1; and if it matches both CIs, the Reward is 3. If two rules match both, the one with the higher user behavior probability (Entropy) will be chosen.

Improved Scalability of Reinforcement Learning by Weighted Negative Bias Entropy

For performance optimization the rules have Entropies for Intra User Group (local), Inter User Group (semi-global) and global contexts. In traditional Reinforcement Learning the number of evaluated strategies is constant. This doesn't scale well. In my formula, the learning starts from the set of local strategies and grows toward the global strategies. In other words, when choosing which strategies to evaluate, a negative bias is added based on the distance of the strategy. The strategies that do not score above zero are eliminated. The strategy's probability (Entropy) is multiplied by the failure rate of the chosen strategy set: number of iterations used * distance from the maximum Reward. The distance from the maximum Reward is a number between zero and one, and the number of iterations is scaled by the server load and the length of the machine learning task queue.
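The two-user Reward example and the Utility formula (probability TIMES Reward) from the "Math" section can be worked through in code. This is a minimal sketch, not OBHave's implementation: the function names, the CI ids ci_a and ci_b, the rule names and their probability values are all made up for illustration. It assumes a rule is fully described by its Entropy-based probability and the set of consumed CIs it would have surfaced early.

```python
from collections import Counter


def consumption_weights(consumed_by_user):
    """Weight each CI by the number of users in the group that consumed it."""
    weights = Counter()
    for items in consumed_by_user.values():
        weights.update(set(items))
    return weights


def reward(matched_items, weights):
    """Sum of the weights of the consumed CIs that a rule matches."""
    return sum(weights[ci] for ci in matched_items)


def utility(probability, matched_items, weights):
    """Utility = Entropy-based probability TIMES the Reward."""
    return probability * reward(matched_items, weights)


# User 1 consumed one of the two CIs, user 2 consumed both (as in the example).
consumed = {"user1": {"ci_a"}, "user2": {"ci_a", "ci_b"}}
weights = consumption_weights(consumed)

print(reward({"ci_a"}, weights))          # 2: CI consumed by both users
print(reward({"ci_b"}, weights))          # 1: CI consumed by one user
print(reward({"ci_a", "ci_b"}, weights))  # 3: both CIs matched

# Hypothetical rules: (probability, consumed CIs the rule matches).
rules = {
    "similar_rating": (0.6, {"ci_a", "ci_b"}),
    "same_genre": (0.4, {"ci_a", "ci_b"}),
    "recently_added": (0.9, {"ci_b"}),
}

# Picking the highest Utility also resolves the tie between the two rules
# that match both CIs in favour of the one with the higher probability.
best = max(rules, key=lambda name: utility(*rules[name], weights))
print(best)  # similar_rating
```

The negative bias by strategy distance and the failure-rate scaling are left out of the sketch, because the text does not pin down how distance is measured.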